Europäisches Patentamt
European Patent Office
Office européen des brevets




(11) Publication number: 0 496 492 B1



(12) EUROPEAN PATENT SPECIFICATION



(45) Date of publication of patent specification:
20.09.95 Bulletin 95/38 



(21) Application number: 92300104.4



(51) Int. Cl.6: H04N 5/782, G09B 7/04,
G11B 15/02 



Date of filing: 07.01.92 



(54) Control apparatus for electronic equipment 




(30) Priority: 12.01.91 JP 13758/91

(43) Date of publication of application : 
29.07.92 Bulletin 92/31 



(45) Publication of the grant of the patent : 
20.09.95 Bulletin 95/38 



(84) Designated Contracting States:
DE FR GB 



(56) References cited : 
EP-A- 0 075 026 
EP-A- 0 313 976 
EP-A- 0 369 430 
DE-A- 3 918 298 
GB-A- 2 220 290 
US-A- 4 333 152

PATENT ABSTRACTS OF JAPAN vol. 12, no. 
385 (E-668)(3232), 14 October 1988; &
JP-A-63129725 



Proprietor: SONY CORPORATION 
7-35, Kitashinagawa 6-chome 
Shinagawa-ku
Tokyo (JP) 



(72) Inventor: Tomitsuka, Hidemi
c/o Sony Corporation 
7-35 Kitashinagawa 6-chome 
Shinagawa-Ku Tokyo (JP) 
Inventor : Tamura, Asako 
c/o Sony Corporation 
7-35 Kitashinagawa 6-chome 
Shinagawa-Ku Tokyo (JP) 
Inventor : Chigusa, Yasuhiro 
c/o Sony Corporation 
7-35 Kitashinagawa 6-chome 
Shinagawa-Ku Tokyo (JP) 
Inventor: Omori, Shiro 
c/o Sony Corporation 
7-35 Kitashinagawa 6-chome 
Shinagawa-ku Tokyo (JP) 

(74) Representative : Nicholls, Michael John et al 
J.A. KEMP & CO. 
14, South Square 
Gray's Inn 

London WC1R 5LX (GB) 



Note : Within nine months from the publication of the mention of the grant of the European patent, any 
person may give notice to the European Patent Office of opposition to the European patent granted. 
Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been 
filed until the opposition fee has been paid (Art. 99(1) European patent convention).



Jouve, 18, rue Saint-Denis, 75001 PARIS



EP 0 496 492 B1



Description 

The present invention relates to a control apparatus, and in particular to a control apparatus for
electronic equipment, such as a video cassette recorder, for making a reservation of video recording
or for directly controlling the electronic equipment.

Many recent home appliances, such as video cassette recorders, have multiple functions, and various
switches for setting the operation modes are provided on the main body of the appliance. Operation
is therefore complicated. Even if the switches are few, the operating procedure is complicated.
Accordingly, few users can freely use the timer functions for making an (advance) reservation of
video recording on the VTR.
Various approaches have been proposed to make the operation of appliances easier. For example, a
bar code reader can be used to simplify the input operation, and it has been proposed that
instructions and various items of information be inputted in natural language, that is, by human
voice through a microphone. The approach relying on the bar code reader offers little freedom,
since bar code labels indicating the various items of information of a program to be reserved have
to be provided, so the application range of this approach is limited. The approach relying on
natural language is therefore considered favorable.

For a voice-operated apparatus in which operation instructions designating the operation modes of
an information recording/reproducing apparatus such as a VCR or VTR can be inputted by human voice,
a voice input device has been proposed which is capable of stably controlling the VCR by generating
control commands to the VCR with reference to status data of the information recording/reproducing
apparatus.

A control data input device has been proposed which is capable of controlling an object by analyzing the 
natural language which is formed of a combination of words representing a plurality of control instructions to 
provide the object to be controlled with a control instruction. 

A similar voice actuated control system is known from document GB-A-2 220 290. 

Although these techniques are able to set and control the operation mode of appliances such as a
VTR, there is much room for improvement in the response to an input. Only numerals and characters
representing the content of the reservation are displayed as a data train on a display panel. Prior
knowledge is necessary to deal with cases in which a correction is made in the course of input or in
which the reservation cannot be completed, so not every user can easily operate the VCR. Many users
have difficulty accepting data comprising only a train of numerals, so an improvement in the
man-machine interface has been demanded.

The present invention has been made under the above-mentioned circumstances. It has as an object
to provide a control apparatus for electronic equipment which makes it easier to operate the
appliance and which provides a natural man-machine interface.

In order to accomplish the above-mentioned object, the present invention provides a control
apparatus for electronic equipment for designating the operation mode thereof, comprising: voice
inputting means having an acoustic-electric transducer for voice inputting instructions and various
items of information to designate the operation mode and for outputting an electric voice signal;
voice recognition means for processing an output signal from the voice inputting means to recognize
the instruction and various items of information; animation character generating means for
outputting a video signal of an animation character who is a message speaker; video image display
means for displaying the video signal from the animation character generating means; voice
synthesizing means for synthesizing the voice signal of the message in response to a message signal
input; voice outputting means for outputting the voice signal from the voice synthesizing means in
voice; and control means responsive to at least the output signal from said voice recognition means
for outputting an operation mode designation and control signal for designating the operation mode
of said electronic equipment, an action control signal for controlling the action of the animation
character in said means and a message signal instructing a message voice which is desired to be
synthesized in said voice synthesizing means.

In accordance with the present invention, electronic equipment is controlled in response to a voice
input of natural language. At this time, a message voice corresponding to a message signal is
outputted from the voice output means, and an animation character displayed on the video display
means is moved in synchronization with the message voice. Accordingly, it sounds as if the character
spoke the message, and the user feels as if he or she were talking with the character.

In a control apparatus for electronic equipment of the present invention, instructions and various
items of information for designating the operation modes can be inputted by voice, converted into
electrical signals and then processed to recognize the instructions and the various items of
information. The operation mode of the electronic equipment is designated in response to the
recognized voice inputs. An animation character which acts as the speaker of a message is displayed
on the video display means and the action of the character is controlled. A message voice is
synthesized in response to the message signal output. The electronic equipment can be controlled in
response to a voice input of natural language, so that it is easier to control the electronic
equipment. A natural man-machine interface can thus be realized which makes users feel as if they
were talking with the animation character and enables them to operate the equipment easily.

The invention will be further described by way of example only, with reference to the accompanying
drawings in which:

Fig. 1 is a block diagram showing the schematic structure of an embodiment of a control apparatus of
the present invention;
Fig. 2 is a flow chart showing a main operation of the control apparatus of the present invention;
Fig. 3 is a schematic view showing a CRT screen on which an animation character and a balloon in an
initial state are displayed;
Fig. 4 is a block diagram showing a structure of a voice recognition circuit;
Fig. 5 is a block diagram explaining the function of a control circuit;
Fig. 6 is a flow chart showing the detail of the processing at each step in the flow chart of Fig. 2;
Fig. 7 is a flow chart showing the detail of the reservation processing in the flow chart of Fig. 6;
Fig. 8 is an illustration showing the CRT screen on which a reservation input request is displayed;
Fig. 9 is an illustration showing the CRT screen on which reserved information confirmation is
displayed;
Fig. 10 is a flow chart showing the detail of an elementary information input and an elementary
information input processing in the flow chart of Fig. 7;
Fig. 11 is a flow chart showing the detail of a lacking information processing in the flow chart of
Fig. 7;
Fig. 12 is a flow chart showing the detail of a duplication check processing in the flow chart of
Fig. 7;
Fig. 13 is a flow chart showing the detail of a display processing in the flow chart of Fig. 6;
Fig. 14 is a flow chart following the flow chart of Fig. 13;
Fig. 15 is a flow chart showing the detail of a change processing in the flow chart of Fig. 6;
Fig. 16 is a flow chart showing the detail of a cancel processing in the flow chart of Fig. 6;
Fig. 17 is a flow chart showing the detail of a VTR operation processing in the flow chart of Fig. 6;
and
Fig. 18 is a flow chart showing the detail of an idling animation processing in the flow chart of
Fig. 6.

Now, the preferred embodiments of the present invention will be described with reference to the drawings. 

The control apparatus of the present embodiment controls the selection of the operation mode of
electronic equipment such as a video tape recorder (VTR) 40, as shown in Fig. 1. In the present
embodiment, the VTR 40 is used as the electronic equipment. The embodiment will be described with
reference to controls such as the selection control of various operation modes of the VTR 40, such
as recording, playback, fast feeding and rewinding, and recording reservation.

In the apparatus of the present embodiment of Fig. 1, a handset 10 which is similar to a telephone
handset is provided as voice input means for inputting, by voice, instructions for selection control
of operation modes and various items of information. A transmitter unit 11 of the handset 10 is
provided with an acoustic-electrical conversion element for converting an input voice into an
electrical signal and outputting it. A press-to-talk switch 12 for dividing inputs by switching the
voice input mode is provided in the vicinity of the transmitter unit 11. An output signal from the
transmitter unit 11 of the handset 10 is fed to a voice recognition circuit 13, in which it is
signal processed to recognize the instruction and various items of information. An output signal
from the switch 12 is fed to a switch state detecting unit 14 for detecting the on/off state of the
switch. The voice recognition circuit 13 and the switch state detecting unit 14 are connected with a
control circuit 15. Also connected with the control circuit 15 are an animation character generating
circuit 16 for outputting a video signal of an animation character which acts as the message
speaker, a voice synthesizing circuit 19 for synthesizing the voice signal of the message in
response to a message signal input by using a rule-based voice synthesis technique, and a VTR
controller 18 for controlling the selection of the operation mode of the VTR 40. A video signal from
the animation character generating circuit 16 is fed to a superimposer 17, in which it is
superimposed upon the video signal from the VTR 40, and the superimposed signal is then fed to a
cathode ray tube (CRT) display 30. A voice signal from the voice synthesizing circuit 19 is fed to a
speaker 20, in which it is converted into a sound. The speaker 20 and the CRT display 30 may be
formed into an integral television receiver circuit.

The control circuit 15 comprises a CPU such as a microprocessor and outputs, in response to an
output signal from the voice recognition circuit 13, at least an operation mode selection control
signal which controls the selection of the operation mode of the VTR 40, an action control signal
for controlling the action of the animation character AC of the animation character generating
circuit 16, and a message signal instructing a message voice which is to be synthesized in the voice
synthesizing circuit 19. In response to an instruction for the selection control of the operation
mode recognized by the voice recognition circuit 13, the control circuit 15 selects, from a
plurality of control instructions, one control instruction which is appropriate for the current
operation state of the VTR 40 and outputs it. The selection processing and the operation mode of the
VTR 40 will be described hereafter in detail. The control circuit 15 also outputs a message signal
instructing the message which is optimal as a response to the content of a voice input (and further
to the current state) and feeds the message signal to the animation character generating circuit 16
and the voice synthesizing circuit 19.

In Fig. 1, the transmitter unit 11 of the handset 10 converts a voice pronounced by a person (an
operator operating the VTR 40, that is, a user) into an electrical voice signal. The voice
pronounced by the operator includes pronounced instruction words, such as "playback", "stop",
"record", "pause", "slow", "fast-feeding" and "rewind", and pronounced words for making a
reservation of video recording, such as the day of the week (Sunday to Saturday), every day, every
week, channel (channels 1 to 12) and starting/ending time (morning, afternoon, 0 to 12 o'clock). In
the present embodiment, words other than those mentioned above, such as "Hey", "O.K." and "No", can
also be inputted by voice.

The press-to-talk switch 12, which is disposed in the handset 10 and designates the voice input
state, indicates the division of the words pronounced by the operator by being turned on or off by
the operator. In other words, the press-to-talk switch 12 is provided to divide an input voice
signal of a sentence comprising a plurality of continuous discrete words into units to be processed
by voice recognition in the voice recognition circuit 13. An output from the press-to-talk switch 12
is fed to the switch state detecting unit 14, which is provided in parallel with the voice
recognition circuit 13. The switch state detecting unit 14 generates a state instruction signal
indicating the current on/off state in response to the output signal from the press-to-talk switch
12. The state instruction signal assumes the states "0" and "1" when the press-to-talk switch 12 is
in the inoperative and operative states, respectively. Accordingly, when voice recognition is to be
conducted, the press-to-talk switch 12 is turned on, and it is turned off after completion of the
voice input. Voice recognition is thus performed in processing units in response to the
corresponding state instruction signal. Since the turning off (state "0") represents the completion
of the voice word input, the voice recognition circuit 13 need not analyze whether or not the input
is completed when continuous words are inputted by voice. In other words, the voice recognition
circuit 13 can determine clear starting and completion times for the voice to be recognized. The
range in which the voice is recognized can easily be determined by software, so unnecessary voice
recognition processing need not be performed for noise outside this range. Since control of the
handset 10 is not performed, noise on switching of the voice input (on cutting off of voices) will
not be inputted.
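
The division of a voice input into processing units by the state instruction signal can be sketched
as follows. This is a minimal illustration only: the function name `segment_by_switch`, the
representation of the input as (state, frame) pairs and the use of one float per audio frame are
assumptions made for the example and are not taken from the specification.

```python
from typing import Iterable, List, Tuple


def segment_by_switch(samples: Iterable[Tuple[int, float]]) -> List[List[float]]:
    """Group audio frames into processing units using the switch state.

    Each item is (state, frame): state is 1 while the press-to-talk switch
    is operative and 0 while it is inoperative. Frames that arrive while the
    state is 0 are dropped, so noise outside the pressed interval never
    reaches the recognizer.
    """
    units: List[List[float]] = []
    current: List[float] = []
    previous_state = 0
    for state, frame in samples:
        if state == 1:
            current.append(frame)
        elif previous_state == 1:
            # Falling edge 1 -> 0: the operator released the switch,
            # which marks the completion of one processing unit.
            units.append(current)
            current = []
        previous_state = state
    if current:
        # The stream ended while the switch was still held.
        units.append(current)
    return units
```

With such a division, the recognizer receives explicit start and end points and does not have to
detect the end of an utterance itself.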

A sense of reluctance or incompatibility in speaking to a machine can be mitigated by feeding a
voice or speech output signal from the voice synthesizing circuit 19 to the receiver unit of the
handset 10 and performing input and output of voice via the handset 10, which is similar in shape to
a telephone handset. Malfunction on use in a noisy place can also be prevented. The response voice
can be kept from other listeners present by feeding the response voice signal only to the receiver
unit of the handset 10, without feeding it to the speaker 20 of the TV set or stereo receiver. Video
recording reservation presetting and VTR operation can thus be performed without disturbing the
other listeners.

A flow chart of the main operations of the apparatus of the present embodiment is shown in Fig. 2.
In Fig. 2, after the power to the apparatus has been turned on, the animation character is called in
the initial state by a voice input or other input operation carried out by the operator (user) at
step S1. The animation character AC as shown in Fig. 3 is then displayed on a screen SC of the CRT
display 30 and the apparatus is brought into a stand-by state for a voice input by the operator.
That is, the animation character AC and a balloon SP for displaying, as characters, the content
spoken by the animation character are superimposed upon the video signal from the VTR 40 (a
background video image) and displayed. A message identical to the message in the balloon SP (for
example, a message "What can I help you?" representing the stand-by state) is synthesized in the
voice synthesizing circuit 19 simultaneously with the display of the message on the screen SC, and
the voice (for example, "What can I help you?") is generated from the speaker 20. A specific example
of the calling processing at step S1 is as follows. When the name of the animation character AC (for
example, Ivy) is called, or a call "Hey" is made, or the power of the VTR 40 is turned on, the
animation character AC is displayed together with a message "Yes, here is Ivy." A more natural
atmosphere of dialogue is obtained by subsequently shifting to the state of Fig. 3.

The operator (user) voice inputs an instruction and various items of information for the selection
control of the operation mode at step S2. If the instruction is an instruction to activate the
running system of the VTR 40, the program step proceeds to step S3, in which the apparatus is
brought into a phase to directly control the operation mode of the VTR 40. If the instruction is to
reserve the video recording, the program step proceeds to step S4, in which the apparatus is brought
into a mode for the video recording reservation operation of the VTR 40. If the instruction is to
confirm the reservation, the program step proceeds to step S5, in which the apparatus is brought
into a mode to confirm the reserved content. In the mode to confirm the reservation, the program
step proceeds from step S5 to step S6, in which processing to change or cancel the reservation is
performed. In the modes to reserve the video recording and to confirm the reservation, a more
sophisticated conversation between the operator and the animation character AC is carried out, as
will be described. After completion of the processing at steps S3, S4 and S5 (S6), the apparatus is
returned to the stand-by state for a voice input.

In the example of display in Fig. 3, the animation character AC is displayed in the lower left area
of the screen SC of the CRT display 30. If the animation character AC were displayed in the center
of the screen SC, it would become a visual obstacle to the video image (background video image BG)
displayed on the screen. An animation character AC displayed in the lower left of the screen will
not become an obstacle to the image on the screen. When the character is displayed in the lower area
of the screen SC, the legs of the character AC do not appear to float, which gives a stable feeling.
When the character AC is displayed on the lower left side of the screen SC, the sense of
incompatibility is smaller from the viewpoint of human perception. The balloon SP for displaying a
message as characters on the screen SC is displayed in the lower area or the right area of the
screen SC when the characters of the message are arrayed horizontally from left to right or
vertically from top to bottom, respectively. This will scarcely obstruct the image (background video
image BG) on the screen SC.

An animation action control signal is fed from the control circuit 15 to the animation character
generating circuit 16 so that the action of the animation character AC is controlled. The animation
character generating circuit 16, on receiving the action control signal from the control circuit 15,
outputs to the superimposer 17 a video signal which causes the animation character AC as shown in
Fig. 3, who is the speaker talking with the operator, to move his or her mouth in response to the
voice output of the message. The animation signal includes a signal of the balloon SP for displaying
the message as characters. The superimposer 17 superimposes the video image of the animation
character AC upon the reproduced video image from the VTR 40 and feeds it to the CRT display 30. The
animation character AC displayed on the screen of the CRT display 30 is made a personified,
expressive and familiar character. This gives the operator (user) the feeling of talking with the
animation character (or with the electronic equipment).

The voice recognition circuit 13 in Fig. 1 recognizes the above-mentioned instructions and various
items of information by processing the supplied voice signal, and may take various structures. A
voice recognition circuit having the structure shown in Fig. 4 will be described.

In Fig. 4, an output signal (input voice signal) from the transmitter unit 11 of the handset 10 is
supplied to the input terminal 21, and the input voice signal is fed to an operational processor 23
via an analog interface 22. The analog interface 22 changes the input voice level to a given level
depending upon control data supplied from a system controller 24, then converts the voice signal
into serial digital voice signals and feeds them to the operational processor 23. The operational
processor 23 forms a voice pattern by frequency analyzing the inputted digital voice signals,
corrects (time axis normalizes) the time distortion of the voice pattern due to changes in the
speaking speed of the human voice, and compares the time axis normalized voice pattern with a
plurality of standard patterns preliminarily stored in a standard pattern storing memory 25, thereby
conducting a so-called pattern matching processing. The pattern matching processing calculates the
distances between the detected voice pattern and each of the standard patterns and determines the
standard pattern which has the shortest distance from the detected voice pattern. The result of the
processing is fed to the control circuit 15 including the CPU via the interface 26. A plurality of
patterns of the above-mentioned words of instructions such as "playback" and "stop" and of various
items of information such as "Sunday" and "channel 1" are stored in the standard pattern storing
memory 25. The operational processor 23 recognizes the words by determining which of the plurality
of stored patterns the voice pattern of the input voice signal corresponds to (that is, to which
stored pattern the voice pattern of the input voice signal is the nearest). The plurality of
standard voice patterns may be stored in the standard pattern storing memory 25 by the manufacturer.
Alternatively, prior to starting to use the apparatus, these patterns may be stored in the memory 25
by the operator successively voice inputting the words and the operational processor 23 frequency
analyzing the input voice to form the voice patterns.
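
The pattern matching described above, time axis normalization followed by selection of the standard
pattern at the shortest distance, can be sketched as follows. The specification does not state how
the normalization or the distance is computed; linear resampling to a fixed length and the Euclidean
distance are assumptions chosen for this example, as are the function names and the reduction of
each voice pattern to a one-dimensional sequence of numbers.

```python
import math
from typing import Dict, List, Sequence


def normalize_length(pattern: Sequence[float], length: int) -> List[float]:
    """Linearly resample a pattern to `length` points (length must be >= 2).

    This stands in for time axis normalization: a word spoken slowly or
    quickly is mapped onto the same number of points before comparison.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    if len(pattern) == 1:
        return [pattern[0]] * length
    resampled = []
    for i in range(length):
        pos = i * (len(pattern) - 1) / (length - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(pattern) - 1)
        frac = pos - lo
        resampled.append(pattern[lo] * (1 - frac) + pattern[hi] * frac)
    return resampled


def recognize(pattern: Sequence[float],
              standards: Dict[str, Sequence[float]],
              length: int = 8) -> str:
    """Return the word whose standard pattern is nearest to the input pattern."""
    probe = normalize_length(pattern, length)
    best_word, best_distance = "", math.inf
    for word, reference in standards.items():
        distance = math.dist(probe, normalize_length(reference, length))
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word
```

For example, with standard patterns `{"playback": [0, 1, 2, 3], "stop": [3, 2, 1, 0]}`, a slower
rendition such as `[0, 0.5, 1, 1.5, 2, 2.5, 3]` is still matched to "playback" because both are
resampled to the same length before the distances are compared.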

The output signal from the voice recognition circuit 13 is fed to the control circuit 15, in which
it is subjected to a natural language input processing and an inference dialogue processing.

A natural language input processing production system 32 functions in response to a voice input 31
from the user to form a meaning frame 33 depending upon the number of the reserved programs. The
meaning frame 33 is fed to the inference dialogue production system 34 to preset and control a video
recording reserving scheduler 35. The natural language input processing production system 32 can be
divided into a sentence normalizing production system PS1, a dividing production system PS2, a
language extracting production system PS3 and a meaning understanding production system PS4. The
inference dialogue production system 34 can be divided into an interpolation inference production
system PS5, a custom learning production system PS6 and a dialogue processing production system PS7.
Slots for a plurality of items, such as an item of information on the day of the week, an item of
information on the channel, an item of information on the starting time, and an item of information
on the recording period of time or the ending time, are provided as video recording reserving
information in one meaning frame 33. The various items of information which have been voice inputted
in the reservation processing are written into the corresponding item slots.

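The meaning frame with its item slots can be pictured as a small record type. The sketch below is
only an illustration under assumptions: the class name `MeaningFrame`, the slot names `day`,
`channel`, `start` and `end`, and the representation of recognized items as (slot, value) pairs are
hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, fields
from typing import Iterable, Optional, Tuple


@dataclass
class MeaningFrame:
    """One video recording reservation; an empty slot is None."""
    day: Optional[str] = None       # day of the week
    channel: Optional[int] = None   # channel number
    start: Optional[str] = None     # starting time
    end: Optional[str] = None       # ending time (or recording period)


def fill_frame(frame: MeaningFrame,
               items: Iterable[Tuple[str, object]]) -> MeaningFrame:
    """Write recognized items into their slots, in whatever order they arrive."""
    slot_names = {f.name for f in fields(frame)}
    for slot, value in items:
        if slot not in slot_names:
            raise KeyError(f"unknown slot: {slot}")
        setattr(frame, slot, value)
    return frame
```

Because each item carries its own slot name, the items can be inputted in any order, and slots that
were not mentioned simply stay empty for later handling.
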
Fig. 6 is a flow chart explaining the processing at each step in the flow chart in Fig. 2. A case in
which the video recording reservation and the operation of the running system of the VTR 40 are
performed is shown.

That is, the operation to call the animation character AC at step S1 in Fig. 2 is carried out at
step S11 in Fig. 6. The call at step S11 is made in response to a voice input of the calling
instruction "Ivy" or "Hey", or the turning on of the power to the VTR 40, as described for step S1
in Fig. 2. When a call for the animation character AC is inputted at step S11, the program step
proceeds to the next step S12 and the animation character AC as shown in Fig. 3 is displayed on the
screen SC of the CRT display 30 (activation of the animation character AC).

Determination whether the press-to-talk switch 12 is turned on or off is made at step S13. If the
switch is turned off, the program step will proceed to step S15. When an instruction from the
operator is voice inputted, determination whether or not the input instruction is an instruction to
reserve the video recording is made at step S16. If the instruction is an instruction to reserve the
video recording (Yes), the processing of reservation of the video recording is carried out at step
S17 and then the program step will return to step S15. If No (the input instruction is not to
reserve the video recording) at step S16, the program step will proceed to step S18, at which
determination is made whether or not the input instruction is an instruction to display the content
of a video recording reservation which has already been made. If Yes at step S18, processing to
display the previous content of the video recording reservation on the screen SC of the CRT display
30 is carried out at step S19 and thereafter the program step will return to step S15. If No, the
program step will proceed to step S20. Determination whether or not the input instruction from the
operator is an instruction to change the video recording reservation is made at step S20. If Yes at
step S20, the video recording reservation is changed at step S21 and then the program step will
return to step S15. If No, the program step will proceed to step S22. Determination whether or not
the input instruction is an instruction to cancel the video recording reservation is made at step
S22. If Yes, processing to cancel the video recording reservation is carried out at step S23 and the
program step will return to step S15. If No, the program step will proceed to step S24.
Determination whether or not the input instruction is an instruction to operate the animation
character AC is made at step S24. If Yes, the program step will return to step S15 after the
animation character AC is operated. The operation of the animation character specifically includes
deletion of the animation character AC from the screen SC and cancellation of voices. If No at step
S24, the program step will proceed to step S26. Determination whether or not the instruction is to
operate the VTR 40 is made at step S26. If Yes, processing to operate the VTR 40 is carried out at
step S15. If No, processing is terminated.
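
The sequence of determinations at steps S16, S18, S20, S22, S24 and S26 amounts to testing the
recognized instruction against an ordered list of cases. A minimal sketch follows; the instruction
labels ("reserve", "display" and so on) and the function name `dispatch` are hypothetical names
introduced for the example, and the two entries without step numbers are those for which the text
above gives no distinct processing step.

```python
from typing import List, Tuple

# Ordered (instruction kind, processing) pairs, checked top to bottom in the
# same order as the determinations at steps S16, S18, S20, S22, S24 and S26.
DECISION_CHAIN: List[Tuple[str, str]] = [
    ("reserve", "S17: reservation processing"),
    ("display", "S19: display of reserved content"),
    ("change", "S21: change of reservation"),
    ("cancel", "S23: cancellation of reservation"),
    ("character", "operation of animation character"),
    ("vtr", "operation of VTR 40"),
]


def dispatch(kind: str) -> str:
    """Return the processing selected for a recognized instruction kind."""
    for candidate, processing in DECISION_CHAIN:
        if kind == candidate:
            return processing
    # No determination answered Yes: processing is terminated.
    return "terminate"
```

Keeping the cases in a single ordered table makes the priority of the determinations explicit and
easy to compare with the flow chart.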

If there has been no voice instruction input from the operator within a given period of time at
step S14, the animation character AC is displayed as if he did not know what to do with himself.
Specifically, the animation character AC first yawns and then leans against the left end of the
screen or scratches his head. If there is still no voice instruction input subsequently, the
animation character AC lies down. Processing for these actions will hereafter be referred to as
processing for idling action.
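
The escalation of the idling action with the length of the silence can be sketched as a simple
threshold function. The specification speaks only of "a given period of time"; the two timeout
values below, the function name `idle_action` and the returned labels are assumptions made for
illustration.

```python
def idle_action(idle_seconds: float,
                first_timeout: float = 30.0,
                second_timeout: float = 60.0) -> str:
    """Choose the idling action from the time elapsed without a voice input.

    The timeout values are hypothetical; only the order of the actions
    follows the description.
    """
    if idle_seconds < first_timeout:
        return "wait"
    if idle_seconds < second_timeout:
        # First stage: yawn, then lean on the screen edge or scratch the head.
        return "yawn"
    # Second stage: still no input, so the character lies down.
    return "lie down"
```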

Fig. 7 is a flow chart explaining the details of the reserving processing at step S17 of Fig. 6. In
Fig. 7, a reservation input request, indicating that the apparatus is waiting for information on the
video recording reservation, is displayed at step S50. The animation character AC and a character
message "Please, make a reservation" in the balloon are displayed on the screen SC of the CRT
display 30. At this time, a synthesized voice "Please, make a reservation" is pronounced and,
simultaneously, a moving picture in which the mouth of the animation character AC moves is
displayed. In the example of Fig. 8, the characters "reservation" representing the current mode are
displayed in a character display window PH in the upper right position of the screen SC.

Voice input of various items of information, such as the day of the week, the channel and the
starting and ending times, to make a reservation of the video recording of the VTR 40 is performed
at the next step S51 in Fig. 7 while the display as shown in Fig. 8 (and the voice output) is
carried out. A plurality of items of information can be inputted at once, in any desired order, at
step S51. Accordingly, input processing relying upon the natural language input processing
production system 32 as shown in Fig. 5 is carried out at step S51. After each processing of
normalization and division of a sentence, word extraction and meaning understanding has been
performed, the plurality of items of information are classified into the corresponding item slots of
the meaning frame 33, such as an item of information on the day of the week, an item of information
on the starting time, and an item of information on the recording period of time or the ending time.

After the information input processing has been performed at step S52, determination whether or not
there is any insufficiency of elemental information is made at step S53.

The words "insufficiency of information" mean that not all items of information are provided in the above-mentioned meaning frame. When four items of information such as the day of the week, the channel, the starting time, and the ending time (or the recording period of time) are provided in the meaning frame on reservation of video recording, a normal video recording reservation cannot be carried out if any one of the four items is lacking.

If it is determined that the elemental information is lacking (Yes) at step S53, an inquiry on the lacking elemental information is made at step S66 and the program step will return to step S51. Complementary processing of the lacking information will be described hereafter with reference to Fig. 11.
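
The completeness check at step S53 can be illustrated with a minimal sketch in Python. The class name MeaningFrame, its field names and the helper missing_items are illustrative assumptions that do not appear in the specification; only the four items themselves are taken from the description above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MeaningFrame:
    """Hypothetical meaning frame with one slot per reservation item."""
    day: Optional[str] = None      # date / day of the week
    channel: Optional[int] = None  # channel number
    start: Optional[str] = None    # starting time
    end: Optional[str] = None      # ending time or recording period of time


def missing_items(frame: MeaningFrame) -> List[str]:
    """Return the names of the slots that are still empty (cf. step S53)."""
    return [f.name for f in fields(frame) if getattr(frame, f.name) is None]


# Only the day and the channel have been spoken so far.
frame = MeaningFrame(day="Tuesday", channel=6)
lacking = missing_items(frame)  # items an inquiry (cf. step S66) would ask for
```

A reservation would be treated as complete only when missing_items returns an empty list.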

If it is determined at step S53 in Fig. 7 that there is no lack of information (No), the program step will proceed to step S54 at which each item of inputted elementary information is confirmed. A display as shown in Fig. 9 is made on the screen SC of the CRT display 30 for confirmation of the elemental information at step S54. The animation character AC and a character message "Is it OK?" in the balloon SP are displayed similarly to the case of Fig. 8, and a synthesized voice "Is it OK?" is pronounced simultaneously with the movement of the mouth of the animation character AC. The characters "Reservation" representing the current phase are displayed in the character display window PH on the screen SC. In the confirmation display of Fig. 9, a window PR for displaying the video recording reservation contents is provided in the center of the screen SC. Each item of data in the meaning frame, such as the data on the date and the day of the week "Tuesday, Nov. 7th", the data on the starting and ending times "3 a.m. to 4 a.m." and the data on the channel "channel 6", is displayed in the display window PR.

Voice input for confirming the reservation displayed at step S54 is carried out at next step S55 in Fig. 7 while the display of Fig. 9 is presented. An instruction such as "Yes", "Return", "No" or "Change", or elementary information, is voice inputted. The natural language input processing shown in Fig. 5 is of course also conducted at this time. Determination whether or not the voice input at step S55 is "Yes" is made at step S56. If it is determined Yes (the voice "Yes" is inputted), the program step will proceed to step S67 at which it is checked whether or not the video recording reservation is duplicated. If it is determined No at step S56 (an input other than "Yes"), the program step will proceed to step S57. Determination whether or not the voice input at step S55 is "Return" is made at step S57. If Yes, determination whether or not an inquiry on lacking elementary information has been made is carried out at step S68. If Yes or No at step S68, the program step will return to step S66 or S50, respectively. If the determination at step S57 is No, the program step will proceed to step S58. Determination whether or not the voice input at step S55 is elementary information is made at step S58. If the voice input is elementary information (Yes), the program step will return to step S52. If the voice input is other than elementary information (No), the program step will proceed to step S59. Determination whether or not the voice input at step S55 is "Change" is made at step S59. If No, the program step will proceed to step S60. Selection whether each item of elementary information of the video recording reservation is to be changed or canceled is requested at step S60. Accordingly, an input of change or cancel is made at step S61 and the program step will proceed to step S62. Determination whether or not the voice input is "Change" is made again at step S62. If No, the video recording reservation is stopped at step S69. If Yes at step S62, the program step will proceed to step S63 at which the content of the change is inquired. After elementary information is inputted again at step S65, the program step will return to step S52.

Fig. 10 is a flow chart showing the detail of the elementary information input at step S51 of Fig. 7 and the elementary information input processing at step S52.

Fig. 10 shows the flow of action taking into account that the natural language word "Return" has a plurality of meanings and that different controls are necessary depending upon the current condition. In other words, when elementary information is inputted in Fig. 10, the meaning of the inputted elementary information is analyzed at step S71 and the program step will proceed to step S72. Determination whether or not the inputted voice is "Return" is made at step S72. If Yes, the program step will proceed to step S73. Determination whether or not the immediately preceding inputted voice was "Return" is made at step S73. If Yes at step S73, the program step will proceed to step S75 at which processing to stop the reservation is performed and then the processing is terminated. If it is determined No at step S73, the program step will proceed to step S74 at which determination whether or not there was any question immediately before is made. If No at step S74, the program step will proceed to step S75 at which the reservation is stopped. If Yes, the program step will proceed to step S76. At step S76, the immediately preceding inquiry is made again and the program step will return to the elementary information input step S51 of Fig. 7. If it is determined No at step S72, the program step will proceed to step S77. Determination whether or not there is any error is made at step S77. If Yes, the program step will proceed to step S79 at which the error item is inquired. Then the program step will return to step S51 at which elementary information is inputted. If it is determined No at step S77, the program step will proceed to step S78 at which writing into the meaning frame is performed and then the processing is terminated.
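
The context-dependent handling of "Return" in steps S73 to S76 can be sketched as follows. This is only an illustration under stated assumptions: the function name interpret_return, its two boolean parameters and the returned action names are hypothetical and are not taken from the specification.

```python
def interpret_return(previous_was_return: bool, question_pending: bool) -> str:
    """Decide what "Return" means from the current condition (cf. steps S73 to S76)."""
    if previous_was_return:
        # cf. steps S73 and S75: "Return" spoken twice in a row stops the reservation.
        return "stop_reservation"
    if not question_pending:
        # cf. steps S74 and S75: there is no earlier question to go back to.
        return "stop_reservation"
    # cf. step S76: repeat the immediately preceding inquiry.
    return "repeat_previous_inquiry"
```

The same word therefore maps to different actions purely as a function of the dialogue state, which is the point made by Fig. 10.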

If a word which has a plurality of meanings, such as "Cancel" or "OK" as well as "Return", is voice inputted, the natural language processing program is arranged so that the meaning of the inputted word is correctly interpreted depending upon the circumstances, that is, so that the diverse meanings of a word can be processed. This can be achieved by incorporating the current condition into the object of determination as shown in Fig. 10. This enables the operator to operate the apparatus by the natural language which is used in ordinary human life.

The voice input "Return" or "Cancel" also has various meanings in a phase in which the feeding system of the VTR 40 is directly operated at step S3 of Fig. 2 or in the phase of reservation confirmation at step S5 (and S6). A response appropriate to the circumstances is made. This will be described hereafter.

Fig. 11 is a flow chart showing the detail of the processing of the lacking information in Fig. 7. For simplicity of illustration, Fig. 7 shows the determinations of presence or absence of the various items of lacking information and the corresponding inquiries collected into single steps. In Fig. 11, these are developed for each of the four items of the meaning frame, and a step depending upon the absence or presence of the elementary information obtained by learning the operator's custom (hereinafter referred to as HLS data) is added.

In other words, in the processing for complementing the lacking information of Fig. 11, determination whether the item of information on the date and the day of the week of the video recording is present or absent is made at the first step S33. If present or absent, the program step will proceed to step S36 or S34, respectively. The date of the video recording to be reserved is inquired by the animation display and the voice of the animation character AC at step S34. The information on the date is inputted at step S35. Determination whether the item of information on the channel of the video recording reservation is present or absent is made at step S36. If present or absent, the program step will proceed to step S39 or S37, respectively. Which channel is to be selected is inquired at step S37. The information on the channel is inputted at step S38.

The present embodiment has a system (HLS) for learning the operator's custom of making reservations of the video recording in the VTR 40. In the HLS system, when the information on the date and the channel (or only the date) is inputted, determination whether or not the information on the date and the channel is the same as the data of a customarily reserved video recording (HLS data) is made at step S39. If it is the HLS data, at least the starting and ending times of the customarily reserved program are displayed (and voice outputted) and whether or not the customary video recording reservation is to be made is confirmed at step S46. While such a display and confirmation are made, the voice "Yes" or "No" is inputted at step S47. Determination whether or not the customary video recording is to be made is carried out at step S48. If Yes or No, the program step will proceed to step S49 or S41, respectively.

If it is determined that the information on the date and the channel is not the customary HLS data, the program step will proceed to step S40. Determination whether the information on the starting time of the video recording is present or absent is made at step S40. If present or absent, the program step will proceed to step S43 or S41, respectively. The starting time of the reserved video recording is inquired at step S41. The information on the starting time is inputted at step S42. After the information on the starting time is inputted, determination whether the ending time of the reserved video recording or the recording period of time is present or absent is made at step S43. If present, the program step will proceed to step S49. If absent, the program step will proceed to step S44. An inquiry on the ending time of the reserved video recording is made at step S44. The information on the ending time or the recording period of time is inputted at step S45. The display shown in Fig. 9 (and a voice output "Is it OK?") is performed at step S49. Confirmation whether or not each item of the information for the video recording reservation which was inputted at each step is correct is made at step S49.

Fig. 12 is a flow chart explaining the detail of the duplication checking processing at step S67 of Fig. 7. In Fig. 12, the above-mentioned reservation content is displayed for confirmation at step S81 (corresponding to step S54 of Fig. 7). "Yes" is voice inputted to initiate the duplication confirmation processing at step S82 (corresponding to steps S55 and S56 of Fig. 7). Accordingly, step S83 and the subsequent steps correspond to the duplication processing of step S67 of Fig. 7.

Determination whether or not the information on the previous video recording reservations includes reservation information whose time is duplicated is made at step S83 of Fig. 12. If No, the reservation of video recording is completed and the processing is terminated at step S92. If Yes, the program step will proceed to step S84. Display of the duplicated video recording reservation and a request for selection of change or cancel of the duplicated video recording reservation are carried out at step S84. Either change or cancel is voice inputted at step S85. Determination whether or not the voice input is "change" is made at step S86. If No, the reservation of the video recording is stopped to terminate the processing at step S93. If Yes, the content of the change is inquired at step S87. The elementary information is voice inputted at step S88. When the elementary information is inputted at step S88, the meaning analyzing processing is performed at step S89. Determination whether an error in the meaning analysis of the voice input is present or absent is made at step S90. If present, the error item is inquired at step S94 and the program step will return to step S88. If it is determined that the error is absent at step S90, the program step will proceed to step S91. After the slot writing into the corresponding information item of the meaning frame has been carried out, the program step will return to step S81.
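
The time duplication test of step S83 can be sketched as a simple interval overlap check. The tuple layout (day index, start minute, end minute), the function names, and the treatment of back-to-back reservations as not duplicated are assumptions made for illustration; the specification does not state how reservations are stored or compared, and recordings that cross midnight are ignored here.

```python
from typing import List, Tuple

# Hypothetical layout: (day index, start minute of day, end minute of day).
Reservation = Tuple[int, int, int]


def overlaps(a: Reservation, b: Reservation) -> bool:
    """True when two reservations on the same day share some recording time."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def find_duplicates(new: Reservation, existing: List[Reservation]) -> List[Reservation]:
    """Return the existing reservations whose time is duplicated with the new one."""
    return [r for r in existing if overlaps(new, r)]


existing = [(2, 180, 240), (2, 300, 360)]           # 3:00-4:00 and 5:00-6:00 on day 2
clash = find_duplicates((2, 210, 270), existing)    # 3:30-4:30 clashes with the first
no_clash = find_duplicates((2, 240, 300), existing) # 4:00-5:00 fits between the two
```

An empty result would correspond to the No branch at step S83, and a non-empty result to the display and change/cancel selection at step S84.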

The error processing at steps S90 and S94 and the error processing at steps S77 and S79 are in principle performed whenever voice input is made. The errors to be processed mainly include a grammatical error in which mismatching occurs in the meaning analyzing routine and an input error in which the designation of the elementary information is wrong (or the designation is not wrong in itself, but is wrong for the specification of the VTR).

In case of the grammatical error, a message "Not understood. Please speak once more" is displayed and voice outputted. The apparatus is brought into a stand-by state.

In case of the input error, a message such as "Not understood. What date will you make a reservation?" or "The starting time is the same as the ending time. Please input the time again", which points out the error item of the inputted sentence to request a reinput, is displayed or voice outputted so that the apparatus is brought into an input stand-by state.

Figs. 13 and 14 are flow charts explaining the detail of a display processing routine at step S19 of Fig. 6. 

When the display processing at step S19 of Fig. 6 is commenced, voice input of a display instruction, change instruction, cancel instruction, etc. is performed at step S100 of Fig. 13. Determination whether or not there is information on the video recording reservation is made at step S101. If not, an error message representing that there is no reservation information to be displayed is displayed and voice outputted to terminate the processing. If it is determined at step S101 that there is information on the video recording reservation, the content of the video recording reservation is displayed at step S103, and determination whether or not there is information on a reservation which has not been displayed is made at step S104. If it is determined at step S104 that there is no reservation information which has not been displayed, the program step will proceed to step S113. If it is determined at step S104 that there is such reservation information, whether or not the screen of the CRT display 30 is to be switched is inquired at step S105. "Yes", "No" or "Return" by voice, or an instruction of change or cancel, is inputted at step S106, and determination whether or not the inputted voice is "Yes" is made at step S108. If it is determined Yes at step S108, the program step will proceed to step S115, at which determination whether or not there is next information for the screen is made. If it is determined Yes at step S115, the program step will proceed to step S116 at which the next information is displayed on the screen, and the program step will return to step S105. If it is determined No at step S115, the program step will proceed to step S117 at which an error message is displayed and will return to step S105. If the determination at step S108 is No, the program step will proceed to step S109. Determination whether or not the voice input is "Return" is made at step S109. If Yes, the program step will proceed to step S118. Determination whether or not there is previous information to be displayed on the screen is made at step S118. If Yes, the previous information is displayed on the screen at step S119 and the program step will return to step S105. If No, the processing is terminated. If the determination at step S109 is No, the program step will proceed to step S110. Determination whether or not the voice input is an instruction of change is made at step S110. If Yes, the change processing is performed at step S120. If No, the program step will proceed to step S111. Determination whether or not the voice input is an instruction of cancel is made at step S111. If Yes, the cancel processing is performed at step S121. If No, the program step will proceed to step S112. Determination whether or not these change and cancel processings were commenced from the instruction of display is made at step S112. If No, the program step will return to step S110. If Yes, the program step will proceed to step S113. Which of change or cancel is to be performed is inquired at step S113, and the voice "Yes", "No" or "Return", or a change instruction or cancel instruction, is inputted at step S114.

When the processing at step S114 of Fig. 13 is terminated, the program step will proceed to step S125 shown in Fig. 14. Determination whether or not the voice input at step S114 is an instruction of change is made at step S125. If Yes, the processing is terminated after the processing for change is performed at step S131. If No, the program step will proceed to step S126. Determination whether or not the voice input at step S114 is a cancel instruction is made at step S126. If Yes, the processing is terminated after the cancel processing is performed at step S132. If No, the program step will proceed to step S127. Determination whether or not the voice input at step S114 is "Yes" is made at step S127. If the determination is No, the processing is terminated. If the determination is Yes, the program step will proceed to step S128. Which of change or cancel is selected is inquired at step S128, and the change instruction, the cancel instruction or "Return" is voice inputted at step S129. Thereafter, the program step will proceed to step S130. Determination whether or not the voice input is "Return" is made at this step. If Yes, the processing is terminated. If No, the program step will return to step S125.

Fig. 15 is a flow chart explaining the detail of the change processing routine at step S21 of the flow chart 
of Fig. 6. 

In Fig. 15, selection processing for the video recording reservation is performed at step S140. Specifically, a schedule of the video recording reservations is displayed and a message inquiring which reservation is to be changed is displayed and voice outputted. If there are four content indicating columns in the displayed schedule, the column whose video recording information is to be changed is voice inputted, for example "first" or "fourth", at next step S141. Determination whether or not video recording reservation information is actually present in the column which is specified by the voice input is made at next step S142. If No at step S142, an error message is displayed at step S150 and the program step will return to step S140. If Yes, the program step will proceed to step S143.

The content to be changed of the designated video recording reserved content is inquired at step S143. The elementary information on the content to be changed is voice inputted at step S144. Confirmation processing of the video recording reserved content which has been changed by the elementary information is performed at step S145. "Yes", "No" or elementary information is voice inputted at step S146. Determination whether or not the voice input at step S146 is elementary information is made at step S147. If Yes, the program step will return to step S145. If No, the program step will proceed to step S148. Determination whether or not the voice input is "Yes" is made at step S148. If Yes, the duplication check processing is performed at step S151 and then the processing is terminated. If the determination at step S148 is No, the program step will return to step S144 after the content to be changed is inquired again at step S149.

Fig. 16 is a flow chart explaining the detail of the cancel processing routine at step S23 of the flow chart 
of Fig. 6. 

In Fig. 16, the video recording reservation information for cancellation is selected at step S160. Which of the four (first to fourth) items of video recording reservation information should be canceled is voice inputted at step S161. Determination whether or not the specified one of the first to fourth items of video recording reservation information is present is made at step S162. If No at step S162, an error message is outputted at step S168 and the program step will return to step S160. If Yes, the program step will proceed to step S163. The processing at steps S160 to S162 in Fig. 16 is substantially identical with that at steps S140 to S142 in Fig. 15.

Next, confirmation processing for cancellation is performed at step S163. "Yes" or "No" is voice inputted at step S164. Determination whether or not the voice input is "Yes" is made at step S165. If the determination is Yes, the program step will proceed to step S166 at which the processing is terminated after completion of the cancellation. If No at step S165, the program step will proceed to step S167 at which the cancel processing is stopped to terminate the processing.

In the above mentioned video recording reservation or reservation confirmation (and change/cancel) phase, an input other than the requested answer is accepted and understood, as is apparent from steps S54 to S59 of Fig. 7, steps S105 to S111 of Fig. 13, steps S113 to S127 of Figs. 13 and 14, and steps S145 to S148 of Fig. 15.

In other words, although the display shown in Fig. 9 (and the voice output) is performed at step S54 of Fig. 7 and an answer "Yes" or "No" is generally requested, directly voice inputting elementary information results in a determination of Yes at step S58, so that the program step can proceed to the elementary information input processing at step S52. The program step can also be shifted to the change processing by voice inputting "Change".

A message inquiring whether or not the display is to be switched is displayed and voice outputted at step S105 of Fig. 13. Although "Yes" or "No" is usually answered, the program step can be shifted to the change processing step S120 or the cancel processing step S121 by directly inputting the change instruction or the cancel instruction, respectively.

This aims at performing the optimum processing suitable for the inputted content by accepting an input other than the requested answer, considering that in usual conversation among human beings an answer which logically jumps from the words of the other person may be made by presuming the meaning from the context. This enables the program to proceed directly to the next input step, omitting an answer "Yes" or "No", so that the operation procedure is simplified. Accordingly, a plurality of procedures exist to enter a given processing. For example, if the video recording reserved content is to be changed, the orthodox procedure is to respond to each question: "No" is answered in response to the question "Is it OK?" on confirmation and then "Change" is instructed in response to the question "What can I help you?" Alternatively, the instruction "Change" may be inputted on confirmation, or the content to be changed may be directly inputted, and the program step can be returned to the immediately preceding information input step by the instruction "Return". The program can thus respond flexibly to the various inputs by the operator. The elementary information of the item to be changed can be changed by directly inputting, for example, "Channel 6". An environment or atmosphere of dialogue by natural language, in which obvious matters need not be expressly stated, can be obtained as well as the simplification of the operation procedure.

Fig. 17 is a flow chart explaining the detail of an operation processing routine of the VTR 40 at step S26 of Fig. 6.

In Fig. 17, the current mode of the VTR 40 is checked (for example, what the operation mode is) at step S171. The operation corresponding to a command responsive to an operation mode designating control signal is retrieved from a matrix for operation modes (a table showing the relation between instructions and operations) which will be described hereafter. A part of the matrix is shown in Table 1.



Table 1

Determination whether or not there is a corresponding operation in the matrix is made in next step S173. If No, an error message is outputted to terminate the processing at step S176. If Yes at step S173, the program step will proceed to step S174 in which an operation corresponding to the command is executed. Thereafter, a message that the operation will be executed or has been executed is outputted to terminate the processing at step S175.

Table 1 shows an example of the operation mode to which the current operation mode is to be shifted when an instruction is voice inputted in the current operation mode. That is, if an instruction "Power" is voice inputted when the current operation mode is, for example, a power off mode, the operation mode is brought into a power-on operation mode. If an instruction "Playback" is voice inputted when the current operation mode is a power off or stop mode, the operation mode is brought into a playback mode. If an instruction "Fast" is voice inputted when the current operation mode is, for example, a stop, fast feeding or cue mode, the operation mode is brought into a fast feeding mode. If the current mode is a playback or pause mode, the operation mode is brought into the cue mode. If the current mode is a rewind or review mode, the operation mode is brought into the review or rewind mode, respectively. If "Return" is voice inputted when the current operation mode is the stop, fast feeding or pause mode, the operation mode is brought into the review mode. If the current operation mode is the playback or cue mode, the operation mode is brought into the reverse playback mode. If the current operation mode is the rewind or review mode, it is brought into the fast feeding or review mode, respectively. In such a manner, a correct operation depending upon the current operation mode is selected in response to the voice input "Fast" or "Return" having plural meanings.
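
The matrix lookup of steps S171 to S176 can be sketched as a table keyed by instruction and current mode. The dictionary below transcribes only the transitions stated in the preceding paragraph; it is not a reproduction of Table 1, and the names TRANSITIONS, _add and next_mode as well as the mode strings are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

# (instruction, current mode) -> next mode, limited to the transitions described in the text.
TRANSITIONS: Dict[Tuple[str, str], str] = {}


def _add(instruction: str, current_modes: List[str], target: str) -> None:
    """Register one target mode for several current modes."""
    for mode in current_modes:
        TRANSITIONS[(instruction, mode)] = target


_add("Power", ["power off"], "power on")
_add("Playback", ["power off", "stop"], "playback")
_add("Fast", ["stop", "fast feeding", "cue"], "fast feeding")
_add("Fast", ["playback", "pause"], "cue")
_add("Fast", ["rewind"], "review")
_add("Fast", ["review"], "rewind")
_add("Return", ["stop", "fast feeding", "pause"], "review")
_add("Return", ["playback", "cue"], "reverse playback")
_add("Return", ["rewind"], "fast feeding")
_add("Return", ["review"], "review")


def next_mode(instruction: str, current: str) -> Optional[str]:
    """Return the next mode, or None when the matrix has no entry (cf. step S173)."""
    return TRANSITIONS.get((instruction, current))
```

A None result corresponds to the error message branch at step S176; any other result corresponds to executing the operation at step S174.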

Fig. 18 is a flow chart explaining the detail of the idling animation processing routine at step S14 of the flow chart of Fig. 6.

Determination whether or not the conversation is in the top level is made at step S181 of Fig. 18. The top level means the level of conversation in which the basic operation of the VTR 40 is (directly) performed or the apparatus is brought into the above mentioned reservation/confirmation phases. If it is determined Yes (the conversation is in the top level) at step S181, the idling animation image for the top level conversation is selected at step S186, and the program step will proceed to step S183. If it is determined No at step S181, the idling animation image for the processing mode in each phase is selected at step S182. It is possible to preset various idling animation video images for the top level and the processing mode. For example, an idling action in which the animation character AC yawns, leans on the edge of the screen, scratches his own head, or lies down may be performed as mentioned above. After any one of the actions of the idling animation images is selected by random number generation, the program step will proceed to step S184. Determination whether or not a given period of waiting time in which there is no voice input has passed is made at step S184. If No, this determination is repeated. If Yes, the program step will proceed to step S185. The idling animation image of the selected pattern is displayed on the screen SC of the CRT display 30. That is, an action representing that the animation character AC is idling is performed.
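
The random selection and waiting time test of Fig. 18 can be sketched as below. The grouping of the four actions into a top level set and a processing mode set is an assumption made purely for illustration (the specification does not say which action belongs to which set), and the names IDLE_PATTERNS, select_idle_animation and should_play are hypothetical.

```python
import random

# Assumed grouping of the idling actions named in the text.
IDLE_PATTERNS = {
    "top_level": ["yawn", "lean_on_screen_edge"],
    "processing": ["scratch_head", "lie_down"],
}


def select_idle_animation(top_level: bool, rng: random.Random) -> str:
    """Pick one preset idling pattern by random number generation."""
    key = "top_level" if top_level else "processing"
    return rng.choice(IDLE_PATTERNS[key])


def should_play(seconds_without_voice: float, waiting_time: float) -> bool:
    """True once the given waiting time without voice input has passed (cf. step S184)."""
    return seconds_without_voice >= waiting_time
```

A caller would select a pattern once, poll should_play until it returns True, and then display the selected pattern.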

In order to make the response of the animation character AC more familiar, it is preferable to make some response to a voice input which is not related to the operation mode designation of the equipment. Specifically, "Can I help you?" is answered in response to the voice input "Hey". An animation image in which the character AC is shy is displayed simultaneously with a reply "I am embarrassed" in response to the voice input "Wonderful!".

An interface between the control circuit 15 of Fig. 1 and the animation character generating circuit 16 or the voice synthesizing circuit 19 is implemented by a message packet MP having a given structure. An example of the message packet MP may comprise three elements: a message type representing the kind of the message, a message number corresponding to each content of the message, and a status. The message type indicates any of a message with no input (refer to Figs. 3 and 8), a message with an input (refer to Fig. 9), a scheduler packet, a list (a schedule of the reserved content) and a duplication list. The message number can specify any of 256 messages when 8 bits are used. That is, the control circuit 15 selects an appropriate message type or kind (content) depending upon the recognition data of the inputted voice and the current status, and puts the message type and the message number into the message packet MP for feeding them to the animation character generating circuit 16 and the voice synthesizing circuit 19. The animation character generating circuit 16 displays the moving mouth of the animation character and the characters of the message, and the voice synthesizing circuit 19 voice synthesizes the message, each corresponding to the message number in the fed message packet MP.
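
A possible encoding of such a three-element message packet is sketched below. The one-byte width of the message type and status fields, the numeric values of the message types and the function names are assumptions; the specification only states that the packet may contain a type, a number and a status, and that an 8-bit number can address 256 messages.

```python
import struct
from enum import IntEnum
from typing import Tuple


class MessageType(IntEnum):
    """The five message kinds named in the text; the numeric values are assumed."""
    NO_INPUT = 0
    WITH_INPUT = 1
    SCHEDULER = 2
    LIST = 3
    DUPLICATION_LIST = 4


def pack_message(msg_type: MessageType, number: int, status: int) -> bytes:
    """Pack type, 8-bit message number and status into three bytes."""
    if not 0 <= number <= 255:
        raise ValueError("message number must fit in 8 bits")
    if not 0 <= status <= 255:
        raise ValueError("status is assumed to fit in one byte")
    return struct.pack("BBB", int(msg_type), number, status)


def unpack_message(packet: bytes) -> Tuple[MessageType, int, int]:
    """Recover the three elements from a packed message packet."""
    type_value, number, status = struct.unpack("BBB", packet)
    return MessageType(type_value), number, status


packet = pack_message(MessageType.WITH_INPUT, 42, 0)
```

Both receiving circuits could decode the same packet and look up the text or the synthesized voice belonging to the message number.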

It is necessary to prepare a plurality of messages in advance in order to perform the display and voice output of the message via the message number. One message among the plurality of messages is selected for display and voice output. Alternatively, the message per se may be composed by the control circuit 15 and the composed message may be fed character by character for display and voice synthesizing.

The natural language input processing will now be described.

When the video recording is reserved, an inference of the lacking item of the time information is conducted within common sense. For example, when only the minute is designated on input of the starting time, the hour is inferred from the current time. When the hour and minutes are designated on input of the ending time, this is determined as the video recording period of time from the starting time. When a plurality of items of elementary information is inputted, the period from half past 8 to just 9 o'clock is inferred in response to a voice input, for example, "thirty minutes, until 9 o'clock", in consideration of the relation between "from" and "until". If the current time is 10 minutes after eight o'clock, the period from 30 minutes past eight o'clock to 30 minutes past 9 o'clock is inferred in response to a voice input "one hour from 30 minutes".

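The inference of a starting time from a spoken minute can be sketched as follows, reproducing the "one hour from 30 minutes" example. The rule that a minute which has already passed rolls over to the next hour is an assumption, since the text only gives an example in which the minute is still ahead; the function name infer_start and the calendar date used are arbitrary.

```python
from datetime import datetime, timedelta


def infer_start(now: datetime, minute: int) -> datetime:
    """Infer the starting time when only the minute is spoken."""
    start = now.replace(minute=minute, second=0, microsecond=0)
    if start < now:
        # Assumed rule: a minute that has already passed means the next hour.
        start += timedelta(hours=1)
    return start


now = datetime(1991, 1, 12, 8, 10)   # 10 minutes after eight o'clock
start = infer_start(now, 30)         # "from 30 minutes"
end = start + timedelta(hours=1)     # "one hour"
```

With the current time of 8:10, this yields a recording period from 8:30 to 9:30, matching the example in the description.
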
It is to be understood that the present invention is not limited to only the foregoing embodiments. The voice inputting means may be a hand microphone in lieu of the handset. A remote control unit may be provided with a small size microphone. The hand microphone or remote control unit may be provided with a switch (press-to-talk switch). The controlled electronic equipment is not limited to only a VTR; the invention is applicable to the control of various devices such as a disk recording and/or playback apparatus and a digital or analog audio tape recorder.



Claims

1. A control apparatus for electronic equipment (40) for designating the operation mode thereof, comprising:

voice inputting means (10) having an acoustic-electric transducer (11) for inputting voice instructions and various items of information to designate the operation mode and for outputting an electric voice signal;

voice recognition means (13) for processing an output signal from the voice inputting means to recognize the instruction and various items of information;

control means (15,18) responsive to the output of the voice recognition means (13) for outputting an operation mode designation and control signal for designating the operation mode of the electronic equipment (40), and a display (30) for displaying information in response to the instruction, characterized by:

animation character generating means (16) for outputting a video signal of an animation character (AC) who is a message speaker, and in that the display means (30) is a video image display means (30) for displaying the video signal from the animation character generating means (16);

voice synthesizing means (19) for synthesizing a voice signal of a message in response to a message signal input;

voice outputting means (20) for outputting the voice signal from the voice synthesizing means (19) in voice; and in that

the control means (15,18) are also responsive to at least the output signal from said voice recognition means for outputting an action control signal for controlling the action of the animation character in said means and a message signal instructing a message voice which is desired to be synthesized in said voice synthesizing means (19).

10 

2. A control apparatus for electronic equipment as defined in claim 1 in which said electronic equipment (40) 
is a video tape recorder and the operation mode is a video recording reservation mode. 

3. A control apparatus for electronic equipment as defined in claim 2 in which if normal reserved video
recording is unable due to a fact that there is an error in the item of information on the date or the item of
information of the channel, the error item is pointed to request a reinput for bringing said video tape
recorder into an input stand-by state.

4. A control apparatus for electronic equipment as defined in claim 1, 2 or 3 in which said control means
causes the control apparatus (15) to perform a control action in response to an input when the input has
a content different from the currently requested input content.

5. A control apparatus for electronic equipment as defined in claim 1, 2, 3, or 4 in which said control means
selects and outputs one control instruction appropriate for the current operation mode of said electronic
equipment (40) among a plurality of control instructions in response to an instruction for designating the
operation mode which is recognized by said voice recognition means (13).

6. A control apparatus for electronic equipment as defined in claim 5 in which said electronic equipment (40)
is a video tape recorder, and if the instruction for designating the operation mode recognized by said voice
recognition means is a word meaning "Fast" when the current operation mode of said video tape recorder
is a "stop" or "playback" mode, said control means brings the video tape recorder into a "fast feeding" mode
or "fast playback" mode, respectively.

7. A control apparatus for electronic equipment as defined in claim 5 or 6 in which said electronic equipment
(40) is a video tape recorder, and if the instruction for designating the operation mode recognized by said
voice recognition means is a word meaning "Return" when the current operation mode of said video tape
recorder is a "stop" or "playback" mode, said control means brings the video tape recorder into a "rewind"
mode or "reverse playback" mode, respectively.

8. A control apparatus for electronic equipment as defined in any one of the preceding claims in which said
video display means (30) displays an animation character (AC), and a balloon (SP) for displaying the
characters of the message.

9. A control apparatus for electronic equipment as defined in claim 8 in which said video display means (30)
further includes a superimposer for superimposing said animation character (AC) upon a reproduced or
broadcast video image.

10. A control apparatus for electronic equipment as defined in any one of the preceding claims in which said
voice inputting means (10) has a switch (12) for designating the voice input state and in which voice input
is enabled or disabled by turning the switch (12) on or off.
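
The mode-dependent selection recited in claims 5 to 7 can be pictured as a lookup keyed by the recognized word and the current operation mode. The sketch below is an illustration only: the table `INSTRUCTION_TABLE`, the function `select_instruction`, and the string labels are hypothetical and do not appear in the specification.

```python
from typing import Optional

# (recognized word, current mode) -> control instruction, following claims 6 and 7.
# All identifiers and labels here are illustrative assumptions.
INSTRUCTION_TABLE = {
    ("fast", "stop"): "fast feeding",
    ("fast", "playback"): "fast playback",
    ("return", "stop"): "rewind",
    ("return", "playback"): "reverse playback",
}


def select_instruction(word: str, current_mode: str) -> Optional[str]:
    """Pick the one instruction appropriate for the current mode, or None if no entry exists."""
    return INSTRUCTION_TABLE.get((word.lower(), current_mode.lower()))


fast_when_stopped = select_instruction("Fast", "stop")
return_when_playing = select_instruction("Return", "playback")
```

Keeping the mapping in a table means the same spoken word resolves to different instructions purely through the current mode, which is the behaviour the claims describe.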



Patentansprüche

1. Steuervorrichtung für eine elektronische Einrichtung (40) zur Bestimmung seiner Betriebsart, mit

einer Spracheingabeeinrichtung (10) mit einem elektro-akustischen Wandler (11) zur Eingabe von
Sprachinstruktionen und verschiedenen Informationsbeiträgen, um die Betriebsart zu bestimmen und um
ein elektrisches Sprachsignal auszugeben;

einer Spracherkennungseinrichtung (13) zur Verarbeitung eines Ausgangssignals von der
Spracheingabeeinrichtung, um die Instruktion und die verschiedenen Informationsbeiträge zu erkennen;

einer Steuereinrichtung (15, 18), die auf das Ausgangssignal der Spracherkennungseinrichtung
(13) reagiert, um ein Betriebsartbestimmungs- und Steuersignal auszugeben, um die Betriebsart der
elektronischen Einrichtung (40) zu bestimmen, und einer Anzeige (30), um die Information in Abhängigkeit
von der Instruktion anzuzeigen, gekennzeichnet durch:

eine Trickzeichenerzeugungseinrichtung (16) zur Ausgabe eines Videosignals eines Trickzeichens
(AC), das ein Nachrichtensprecher ist, und wobei die Anzeigeeinrichtung (30) eine
Videobildanzeigeeinrichtung (30) ist, um das Videosignal von der Trickzeichenerzeugungseinrichtung (16)
anzuzeigen;

eine künstliche Sprachaufbaueinrichtung (19) zum künstlichen Aufbau eines Sprachsignals einer
Nachricht in Abhängigkeit von einem Nachrichteneingangssignal;

eine Sprachausgabeeinrichtung (20) zur sprachlichen Ausgabe des Sprachsignals von der
künstlichen Sprachaufbaueinrichtung (19); und daß

die Steuereinrichtung (15, 18) außerdem auf zumindest das Ausgangssignal von der
Spracherkennungseinrichtung reagiert, um ein Handlungssteuersignal zur Steuerung der Handlung des
Trickzeichens in der Einrichtung auszugeben, und ein Nachrichtensignal, das eine Nachrichtensprache
anweist, daß sie in der Sprachaufbaueinrichtung (19) künstlich aufgebaut werden soll.

2. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 1, bei der die elektronische
Einrichtung (40) ein Videobandrekorder ist und die Betriebsart ein Videoaufzeichnungsreservierungsmodus ist.

3. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 2, bei dem, wenn eine normale
reservierte Videoaufzeichnung unmöglich aufgrund der Tatsache ist, daß ein Fehler in dem
Informationsbeitrag bezüglich des Datums oder des Informationsbeitrags bezüglich des Kanals vorhanden ist, der
Fehler herausgestellt wird, um eine nochmalige Eingabe anzufordern, um den Videobandrekorder in einen
Eingabe-Stand-by-Zustand zu bringen.

4. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 1, 2 oder 3, bei der die
Steuereinrichtung die Steuervorrichtung (15) veranlaßt, eine Steuerhandlung in Abhängigkeit von einem
Eingangssignal durchzuführen, wenn das Eingangssignal einen Inhalt hat, der sich von dem laufend angeforderten
Eingabeinhalt unterscheidet.

5. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 1, 2, 3 oder 4, bei der die
Steuereinrichtung eine Steuerinstruktion auswählt und ausgibt, die zur laufenden Betriebsart der elektronischen
Einrichtung (40) paßt unter einer Vielzahl von Steuerinstruktionen in Abhängigkeit von einer Instruktion
zur Bestimmung der Betriebsart, die durch die Spracherkennungseinrichtung (13) erkannt wird.

6. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 5, bei der die elektronische
Einrichtung (40) ein Videobandrekorder ist, und, wenn die Instruktion zur Bestimmung der Betriebsart, die durch
die Spracherkennungseinrichtung erkannt wird, ein Wort ist, das "schnell" bedeutet, wenn die laufende
Betriebsart des Videobandrekorders ein "Stopp-" oder eine "Wiedergabe"-Betriebsart ist, die
Steuereinrichtung den Videobandrekorder in eine "schnelle Vorlauf"-Betriebsart bzw. "schnelle
Wiedergabe"-Betriebsart bringt.

7. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 5 oder 6, bei der die elektronische
Einrichtung (40) ein Videobandrekorder ist, und wenn die Instruktion zur Bestimmung der Betriebsart, die
durch die Spracherkennungseinrichtung erkannt wird, ein Wort ist, das "kehre zurück" bedeutet, wenn die
laufende Betriebsart des Videobandrekorders ein "Stopp-" oder eine "Wiedergabe"-Betriebsart ist, die
Steuereinrichtung den Videobandrekorder in eine "Rücklauf"-Betriebsart oder eine
"Umkehrwiedergabe"-Betriebsart bringt.

8. Steuervorrichtung für eine elektronische Einrichtung nach einem der vorhergehenden Ansprüche, bei der
die Videoanzeigeeinrichtung (30) ein Trickzeichen (AC) anzeigt und eine Sprachblase (SP) zur Anzeige
der Nachrichtenzeichen.

9. Steuervorrichtung für eine elektronische Einrichtung nach Anspruch 8, bei der die
Videoanzeigeeinrichtung (30) weiter eine Überlagerungseinrichtung zur Überlagerung des Trickzeichens (AC) bei einem
wiedergegebenen oder gesendeten Videobild aufweist.

10. Steuervorrichtung für eine elektronische Einrichtung nach einem der vorhergehenden Ansprüche, bei der
die Spracheingabeeinrichtung (10) einen Schalter (12) aufweist, um den Spracheingabezustand zu
bestimmen, und bei der die Spracheingabe durch Drehen des Schalters (12) auf "ein" oder "aus"
eingeschaltet oder abgeschaltet wird.



Revendications

1. Appareil de commande pour un équipement électronique (40) pour désigner son mode de fonctionnement,
comprenant :

un moyen d'entrée de parole (10) comportant un transducteur acousto-électrique (11) pour entrer
des instructions de parole et divers éléments d'information pour désigner le mode de fonctionnement et
pour émettre en sortie un signal de parole électrique ;

un moyen de reconnaissance de la parole (13) pour traiter un signal de sortie provenant du moyen
d'entrée de parole afin de reconnaître l'instruction et les divers éléments d'information ;

un moyen de commande (15, 18) sensible à la sortie du moyen de reconnaissance de la parole
(13) pour émettre en sortie un signal de désignation de mode de fonctionnement et de commande pour
désigner le mode de fonctionnement de l'équipement électronique (40), et un affichage (30) pour afficher
une information en réponse à l'instruction,
caractérisé par :

un moyen de génération de caractère d'animation (16) pour émettre en sortie un signal vidéo d'un
caractère d'animation (AC) qui est un locuteur de message et en ce que le moyen d'affichage (30) est
un moyen d'affichage d'image vidéo (30) pour afficher le signal vidéo provenant du moyen de génération
de caractère d'animation (16) ;

un moyen de synthèse de la parole (19) pour synthétiser un signal de parole d'un message en
réponse à une entrée de signal de message ;

un moyen de sortie de parole (20) pour émettre en sortie le signal de parole provenant du moyen
de synthèse de la parole (19) selon une parole ; et en ce que

le moyen de commande (15, 18) est également sensible à au moins le signal de sortie provenant
dudit moyen de reconnaissance de la parole pour émettre en sortie un signal de commande d'action pour
commander l'action du caractère d'animation dans ledit moyen et un signal de message donnant en
instruction une parole de message que l'on souhaite voir synthétiser dans ledit moyen de synthèse de la
parole (19).

2. Appareil de commande pour un équipement électronique selon la revendication 1, dans lequel ledit
équipement électronique (40) est un enregistreur à bande vidéo et le mode de fonctionnement est un mode
réservation d'enregistrement vidéo.

3. Appareil de commande pour un équipement électronique selon la revendication 2, dans lequel, si un
enregistrement vidéo réservé normal est indisponible du fait qu'il y a une erreur au niveau de l'élément
d'information concernant la date ou au niveau de l'élément d'information du canal, l'élément en erreur est
pointé pour demander en requête une ré-entrée pour amener ledit enregistreur à bande vidéo dans un
état d'attente d'entrée.

4. Appareil de commande pour un équipement électronique selon la revendication 1, 2 ou 3, dans lequel
ledit moyen de commande force l'appareil de commande (15) à réaliser une action de commande en
réponse à une entrée lorsque l'entrée présente un contenu différent du contenu d'entrée présentement
demandé en requête.

5. Appareil de commande pour un équipement électronique selon la revendication 1, 2, 3 ou 4, dans lequel
ledit moyen de commande sélectionne et émet en sortie une instruction de commande qui convient pour
le mode de fonctionnement courant dudit équipement électronique (40) parmi une pluralité d'instructions
de commande en réponse à une instruction pour désigner le mode de fonctionnement qui est reconnu
par ledit moyen de reconnaissance de la parole (13).

6. Appareil de commande pour un équipement électronique selon la revendication 5, dans lequel ledit
équipement électronique (40) est un enregistreur à bande vidéo et si l'instruction pour désigner le mode de
fonctionnement reconnu par ledit moyen de reconnaissance de la parole est un mot signifiant "rapide"
lorsque le mode de fonctionnement courant dudit enregistreur à bande vidéo est un mode "arrêt" ou
"lecture", ledit moyen de commande amène respectivement l'enregistreur à bande vidéo dans un mode
"avance rapide" ou dans un mode "lecture rapide".

7. Appareil de commande pour un équipement électronique selon la revendication 5 ou 6, dans lequel ledit
équipement électronique (40) est un enregistreur à bande vidéo et si l'instruction pour désigner le mode
de fonctionnement reconnu par ledit moyen de reconnaissance de la parole est un mot signifiant "retour"
lorsque le mode de fonctionnement courant dudit enregistreur à bande vidéo est un mode "arrêt" ou un
mode "lecture", ledit moyen de commande amène respectivement l'enregistreur à bande vidéo dans un
mode "rebobinage" ou dans un mode "lecture inverse".

8. Appareil de commande pour un équipement électronique selon l'une quelconque des revendications
précédentes, dans lequel ledit moyen d'affichage vidéo (30) affiche un caractère d'animation (AC) et une
bulle (SP) pour afficher les caractères du message.

9. Appareil de commande pour un équipement électronique selon la revendication 8, dans lequel ledit moyen
d'affichage vidéo (30) inclut en outre un dispositif de superposition pour superposer ledit caractère
d'animation (AC) sur une image vidéo reproduite ou diffusée.

10. Appareil de commande pour un équipement électronique selon l'une quelconque des revendications
précédentes, dans lequel ledit moyen d'entrée de parole (10) comporte un commutateur (12) pour désigner
l'état d'entrée de parole et dans lequel une entrée de parole est validée ou invalidée en activant ou en
désactivant le commutateur (12).
